Assessment of content delivery services using performance measurements from within an end user client application

ABSTRACT

A system for measuring and monitoring performance of online content is provided. In one embodiment, the system includes an intermediary device, such as a web proxy, that receives client requests for content, such as requests for web pages. The device obtains the requested content, modifies it by applying one or more performance optimizations, and serves it to the client. The device also inserts code into the content for execution by the client to gather and report data reflecting, e.g., how quickly the client is able to get and process the content. The code includes information identifying the modifications the device made, and this is reported with the timing data, so that the effect on performance can be analyzed. In other embodiments, the device selects one of multiple versions of content, and the inserted code contains information identifying the selected version. The foregoing are merely examples; other embodiments are described herein.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 15/645,706, filed Jul. 10, 2017, which is a continuation of U.S. application Ser. No. 15/261,026, filed on Sep. 9, 2016, which is a continuation of U.S. application Ser. No. 13/720,636, filed Dec. 19, 2012, which claims the benefit of priority of U.S. Provisional Application No. 61/579,674, filed Dec. 23, 2011. The teachings of all of the foregoing applications are hereby incorporated by reference in their entireties.

This patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of this patent document, as it appears in the Patent and Trademark Office file or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND

Technical Field

This application relates generally to distributed data processing systems and to the delivery of content to end users over computer networks, and more particularly to the measurement and assessment of content delivery services.

Brief Description of the Related Art

Distributed computer systems are well-known in the prior art. One such distributed computer system is a “content delivery network” or “CDN” that is operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third parties. A “distributed system” of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure. Typically, “content delivery” refers to the storage, caching, or transmission of content (such as web pages, streaming media and applications) on behalf of content providers, and ancillary technologies used therewith including, without limitation, DNS query handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence.

In a known system such as that shown in FIG. 1, a distributed computer system 100 is configured as a content delivery network (CDN) and is assumed to have a set of machines 102 distributed around the Internet. Typically, most of the machines are content servers located near the edge of the Internet, i.e., at or adjacent to end user access networks. A network operations command center (NOCC) 104 may be used to administer and manage operations of the various machines in the system. Third party sites affiliated with content providers, such as web site 106, offload delivery of content (e.g., HTML, embedded page objects, streaming media, software downloads, and the like) to the distributed computer system 100 and, in particular, to the servers (which are sometimes referred to as proxy servers if running a proxy application as described below, or sometimes as “edge” servers in light of the possibility that they are near an “edge” of the Internet). Such servers may be grouped together into a point of presence (POP) 107.

Typically, content providers offload their content delivery by aliasing (e.g., by a DNS CNAME) given content provider domains or sub-domains to domains that are managed by the service provider's authoritative domain name service. End user client machines 122 that desire such content may be directed to the distributed computer system to obtain that content more reliably and efficiently. The servers respond to the client requests, for example by obtaining requested content from a local cache, from another content server, from the origin server 106, or other source.

Although not shown in detail in FIG. 1, the distributed computer system may also include other infrastructure, such as a distributed data collection system 108 that collects usage and other data from the content servers, aggregates that data across a region or set of regions, and passes that data to other back-end systems 110, 112, 114 and 116 to facilitate monitoring, logging, alerts, billing, management and other operational and administrative functions. Distributed network agents 118 monitor the network as well as the server loads and provide network, traffic and load data to a DNS query handling mechanism 115, which is authoritative for content domains being managed by the CDN. A distributed data transport mechanism 120 may be used to distribute control information (e.g., metadata to manage content, to facilitate load balancing, and the like) to the servers.

As illustrated in FIG. 2, a given machine 200 in the CDN (sometimes referred to as an “edge machine”) comprises commodity hardware (e.g., an Intel processor) 202 running an operating system kernel (such as Linux® or variant) 204 that supports one or more applications 206 a-n. To facilitate content delivery services, for example, given machines typically run a set of applications, such as an HTTP proxy 207, a name server 208, a local monitoring process 210, a distributed data collection process 212, and the like. The HTTP proxy 207 (sometimes referred to herein as a global host or “ghost” application) typically includes a manager process for managing a cache and delivery of content from the machine. For streaming media, the machine typically includes one or more media servers, such as a Windows® Media Server (WMS) or Flash® 2.0 server, as required by the supported media formats.

The machine 200 shown in FIG. 2 may be configured to provide one or more extended content delivery features, preferably on a domain-specific, content-provider-specific basis, preferably using configuration files that are distributed to the content servers using a configuration system. A given configuration file preferably is XML-based and includes a set of content handling rules and directives that facilitate one or more advanced content handling features. The configuration file may be delivered to the servers via the data transport mechanism. U.S. Pat. Nos. 7,111,057 and 7,240,100 illustrate a useful infrastructure for delivering and managing CDN server content control information, and this and other content server control information (sometimes referred to as “metadata”) can be provisioned by the CDN service provider itself, or (via an extranet or the like) the content provider customer who operates the origin server.

The CDN may include a network storage subsystem for the content providers to store and originate content (sometimes referred to herein as “NetStorage”) which may be located in a network datacenter accessible to the content servers, such as described in U.S. Pat. No. 7,472,178, the disclosure of which is incorporated herein by reference. For live streaming delivery, the CDN may include a live delivery subsystem, such as described in U.S. Pat. No. 7,296,082, and U.S. Publication No. 2011/0173345, the disclosures of which are incorporated herein by reference.

As an overlay, the CDN resources may be used to facilitate wide area network (WAN) acceleration services between enterprise data centers (which may be privately managed) and third party software-as-a-service (SaaS) providers.

Given the ability to configure the CDN servers described above, a wide variety of content delivery features may be implemented in the CDN platform generally and by the CDN servers specifically. For example, a server may be configured to apply modifications to a given web page as it traverses the server (e.g., going from an origin/source to an end-user client) so as to reduce the number of requests the client has to make, to reduce the payload of the content, to accelerate client application processing/rendering, to tailor the content for a particular client device (and its capabilities), or otherwise enhance the performance and functionality of the content. A wide variety of such treatments are known in the art and often referred to as ‘front-end’ web optimizations or as ‘web content’ optimizations.

By way of example, U.S. Publication No. 2011/0314091 describes systems and methods for applying performance-enhancing modifications to web pages, and teachings of this publication are hereby incorporated by reference herein. A dynamic image delivery system is described in U.S. Pat. No. 8,060,581, the contents of which are hereby incorporated by reference. U.S. Patent Publication No. 2012/0265853 and U.S. Patent Publication No. 2012/0259942 describe systems and methods for streaming media and for executing a byte-based interpreter in a proxy server that can be used to modify content, to add rights management information and/or watermarks and the like. The contents of all of the foregoing patent documents are hereby incorporated by reference in their entireties.

Other performance-enhancing aspects of the CDN platform relate to the ability to intelligently map end-user clients to servers, and to the ability to intelligently route and manage the transmission of content across the network. For example, the CDN may operate a cache hierarchy to provide intermediate caching of customer content; one such cache hierarchy subsystem is described in U.S. Pat. No. 7,376,716, the disclosure of which is incorporated herein by reference. A transport and routing mechanism for arbitrary data flows is described in U.S. Pat. No. 7,660,296, the disclosure of which is hereby incorporated by reference. A system and method for delivery of content using intermediate nodes to facilitate content delivery is described in U.S. Pat. No. 6,820,133, the contents of which are hereby incorporated by reference. A global hosting system that can utilize a network map is described in U.S. Pat. No. 6,108,703, the contents of which are hereby incorporated by reference.

There are many ways to measure the performance of web pages (and of a CDN) in a general sense, be it using synthetic monitoring or so-called real-user monitoring from within a browser. However, current performance measurement approaches are limited. It would be desirable to be able to better show the value of individual features or enhancements offered by a CDN on page performance. Furthermore, it is desirable to improve the ability to identify and address performance issues that may be affecting particular features or particular aspects of the CDN platform, or particular kinds of end-user clients (particular browsers or other client applications or particular devices) served by the CDN, or particular combinations of the foregoing. The teachings herein address such needs and offer other benefits and advantages that will become clear in view of this disclosure.

BRIEF SUMMARY

This disclosure describes, among other things, a performance monitoring and measurement system which may be implemented within a CDN. In one embodiment, the system includes an intermediary device, such as a web proxy server, that receives requests for content from clients, such as requests for a web page. The intermediary device obtains the requested content, modifies it (e.g., by applying one or more performance-enhancing optimizations), and serves the modified content to the client. Though not limiting, the modifications are typically products and/or features offered by a service provider managing the intermediary device and/or a content delivery network (CDN) of which the device is a part. Continuing the example, the device also inserts code into the content that will be executed by the client so as to cause the client to gather timing data reflecting how quickly the client gets and is able to process the content for the end-user, and to report that data back to a back-end processing system. The code further includes information identifying the modifications that the intermediary device made. This information is included in the reported data, so that the effect of the modification(s) on performance can be analyzed. The back-end processing system receives the data and processes it so that it can be viewed in a user interface (e.g., visualized in various ways to show the performance improvements associated with the performance features provided by the service provider), analyzed by CDN service provider personnel, and/or fed into mapping/configuration systems in order to tune the operation of the device and the CDN.

In other embodiments, rather than making modifications to the content, the intermediary device can select one of multiple versions of content, and the inserted code can contain information identifying the selected content version, so that performance data can be correlated with content versions selected by the intermediary. In other embodiments, the intermediary device can send the content with or without modification, and the inserted code can contain information relating to or identifying the intermediary device, devices from which the intermediary device obtained the content, or information about the client (e.g., characteristics of the client device, etc.), so that performance data can be correlated with these factors.

As those skilled in the art will recognize, the foregoing merely refers to examples provided for purposes of overview and introduction. They do not represent all possible variants or embodiments. Other embodiments are described below. The foregoing is not limiting, and the teachings hereof may be realized in a variety of systems, methods, apparatus, and non-transitory computer-readable media. The appended claims define the subject matter for which protection is sought. It should also be noted that the allocation of functions to particular machines is not limiting, as the functions recited herein may be combined or allocated amongst different machines in a variety of ways.

BRIEF DESCRIPTION OF THE DRAWINGS

The disclosure will be more fully understood from the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic diagram illustrating one embodiment of a known distributed computer system configured as a content delivery network;

FIG. 2 is a schematic diagram illustrating one embodiment of a machine on which a content server in the system of FIG. 1 can be implemented;

FIG. 3 is a schematic diagram illustrating one embodiment of a performance monitoring and measurement system for a CDN;

FIG. 4 is a schematic diagram illustrating one embodiment of a user interface for a web portal for configuring and otherwise interacting with the system;

FIG. 5 is a schematic diagram illustrating one embodiment of a NavTiming page load phase breakdown, as it might appear in the portal;

FIG. 6 is a block diagram illustrating hardware in a computer system that may be used to implement the teachings hereof.

DETAILED DESCRIPTION OF EMBODIMENTS

The following description sets forth embodiments of the invention to provide an overall understanding of the principles of the structure, function, manufacture, and use of the subject matter disclosed herein. The systems, methods and apparatus described herein and illustrated in the accompanying drawings are non-limiting examples; the scope of the invention is defined solely by the claims. The features described or illustrated in connection with one exemplary embodiment may be combined with the features of other embodiments. Such modifications and variations are intended to be included within the scope of the present invention. All patents, publications and references cited herein are expressly incorporated herein by reference in their entirety.

1.0 Performance Measurement and Monitoring System

FIG. 3 illustrates, at a high level, an embodiment of a performance measurement and monitoring system 300 which can be built as an overlay to the CDN shown in FIGS. 1-2. FIG. 3 shows an intermediary device 302, which is preferably (though without limitation) one of the machines 102 configured as a proxy server, as shown and described above with respect to FIGS. 1-2. Assume that the client 306 (in this example, a web browser 306 running on an end-user machine) has made a request (e.g., an HTTP ‘Get’ request) for a particular web page to the proxy server 302 after having been mapped to that proxy server via a DNS system lookup. The proxy server 302 obtains the page from the origin/source server 301, or from an internally cached copy that was previously retrieved. In the illustrated embodiment, the functionality of the proxy server 302 is extended to insert measurement code 304 into the web page (e.g., an HTML file, although other content and/or markup language file formats could be used) that was obtained by the proxy server 302 and is being delivered to the client application. The insertion of the code can be controlled and customized on a content provider by content provider basis (or even a page by page basis), using the metadata approach described previously. The code enables and facilitates the collection of performance data from within the client application and the reporting back of this data to the CDN. The CDN can then provide this information to content providers (e.g., through an extranet portal), report the information for internal use, and/or act on this information directly.

It should be noted that the insertion of code can be performed by any intermediary device, and is not limited to CDN proxy servers, which are merely examples. The intermediary might be another device or a software module residing in the communication path between the content origin and the client. Moreover, the client can be any application and is not limited to a browser 306, though that is used herein for illustrative purposes.

In this embodiment, the inserted code 304 is preferably a small, relatively fixed body of JavaScript. This code will gather timing data and cause the browser 306 to send it back to the CDN as query string arguments on an HTTP GET request, referred to herein as a beacon. In one embodiment the beacon is sent on a content-provider-controlled hostname; alternatively, the beacon can be sent to the hostname of the base page (the page on which the JavaScript is loaded), which will save a DNS lookup and a TCP connect in the common case.

As mentioned previously, a configuration (config) for how to deliver the beacon may be provided to the proxy server. In many cases, it is desirable to inline all the necessary JavaScript in the base page. However, if the base page grows too large, an alternative is to inline a script that asynchronously loads the main body of code. In this case a simple object-delivery configuration for delivering the browser-cacheable JavaScript to the end user from the proxy server is possible. Preferably, this will be on a provider-controlled hostname to avoid caching the object multiple times at the proxy server.

As shown in FIG. 3, there is also provided a back-end system 308 for receiving the beacons generated at the browser, producing download receipts, forwarding them to a processing engine, and inserting summary results into a database. Preferably, the download receipt contains the full beacon URL and client IP address. It should be appreciated that in many cases, the system can be set up so that the client 306 sends the data back to the server 302, which then relays it to the back-end system 308, since the server 302 may be closer to the client 306.
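By way of illustration only, the following sketch shows one way a back-end component might turn a received beacon into a download receipt holding the full beacon URL and the client IP address, while discarding malformed data as described in the beacon data section (section 3.1.3). The function name `makeDownloadReceipt`, the subset of integer fields, and the length limit are assumptions made for this example and do not come from the described system.

```javascript
// Integer-valued beacon fields (an illustrative subset of the beacon fields).
const INT_FIELDS = new Set(["dnsE", "respE", "domC", "leE", "plt", "testo", "hs"]);
// Assumed upper bound on any single value; the source gives no specific limit.
const MAX_VALUE_LENGTH = 2048;

// Builds a receipt from an absolute beacon URL and the client IP address.
// Returns null when the beacon is malformed and should be discarded.
function makeDownloadReceipt(beaconUrl, clientIp) {
  const parsed = new URL(beaconUrl);
  const fields = {};
  for (const [key, value] of parsed.searchParams) {
    if (value.length > MAX_VALUE_LENGTH) return null; // excessive length
    if (INT_FIELDS.has(key)) {
      // Integers are signed, so a leading minus sign is accepted.
      if (!/^-?\d+$/.test(value)) return null; // string in an integer column
      fields[key] = parseInt(value, 10);
    } else {
      fields[key] = value;
    }
  }
  return { url: beaconUrl, clientIp: clientIp, fields: fields };
}
```

A processing engine could then consume the `fields` object of each receipt and aggregate it into summary results.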

A visualization system 310 extracts data from the database on-demand and produces charts for display to participating content providers in the service provider extranet web portal.

A control system 312 can be used to process the performance information and identify and address performance issues affecting the operation and performance of the CDN platform. The system may identify, for example, that certain web page optimizations/treatments are causing a problem for a particular content provider or particular category of client (e.g., a client device running a particular browser). It might determine, by way of further example, that a particular set of end-user clients is receiving poor performance because certain proxy servers are functioning sub-optimally, or those proxy servers are being mapped to sub-optimal cache hierarchy parents, or those clients are being mapped sub-optimally to the proxy servers, or otherwise. Using this kind of information, new control information to address the performance issue, in the form of an updated configuration file for example, can be fed back to the proxy server 302 (or to another component of the CDN such as the mapmaker in FIG. 1).

2.0 Gathering Performance Data from Browsers

There are several ways for the inserted code 304 to access and collect performance data from the browser 306. JavaScript has some ability to gather performance data; for example, there are universally-supported hooks for millisecond-granularity timers and for registering a function that gets executed when the browser considers the page to be fully loaded (e.g., the “onload” event). This allows, for example, the inserted code to grab a timestamp as the first step in the <head> section of a page, grab another when the onload event fires, and compute the difference. This does not capture network time (DNS, connect, first-byte), however, and is dependent on browser operation with regard to the firing of the onload event.
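The timestamp-difference approach just described might be sketched as follows. This is a minimal example rather than the inserted code 304 itself; the helper name `startPageTimer` and the injectable `now` clock are assumptions introduced so the logic can also be exercised outside a browser.

```javascript
// Records a start timestamp and returns a function reporting the elapsed
// milliseconds. `now` defaults to Date.now (millisecond granularity).
function startPageTimer(now = Date.now) {
  const headStart = now(); // would run as the first step in <head>
  return function elapsedSinceHead() {
    return now() - headStart;
  };
}

// In a browser, the timer would be read when the onload event fires.
if (typeof window !== "undefined") {
  const elapsed = startPageTimer();
  window.addEventListener("load", function () {
    console.log("page load time (ms):", elapsed());
  });
}
```

As noted above, a value measured this way excludes DNS, connect, and first-byte time, because the script only starts running once the page markup has begun to arrive.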

Browsers have varying abilities to gather more detailed data. Many different techniques are available to capture the point in time at which a page becomes “ready” for use by the end-user, and to capture estimates of network time. These techniques are generally browser-specific and known to those skilled in the art. Many of them have been incorporated into general-purpose libraries like jQuery and Boomerang.

Browsers that are compliant with W3C Navigation Timing (referred to herein as “NavTiming”) or Resource Timing (referred to herein as “ResTiming”) specifications offer another option for gathering performance data. These specifications describe a framework via which a browser can directly export timing data about the base page and about embedded objects. As those skilled in the art will appreciate, more information about the specifications can be found at the W3C website.

From the browser's point of view, an HTTP request for a page can be broken into a series of phases: Prompt for unload, Redirect, App cache, DNS, TCP, Request, Response, Processing, and onLoad. A browser implementing the Navigation Timing specification makes available a timestamp (milliseconds since the epoch) at a variety of points in each phase, including the start and stop time of each phase. Notably, five distinct timestamps are made available during the “Processing” phase of the transaction: start time, the time at which the page becomes interactive to the user, the time at which the browser fires the content-loaded event, the time at which the content-loaded event completes, and the time at which the document object model (DOM) is complete in memory. The specification provides more precise definitions for all these events.
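For illustration, the phase breakdown can be sketched as a function over a Navigation Timing-style object. The attribute names are those defined by the W3C specification; the function name `phaseDurations` and the particular choice of start and end attribute for each phase are assumptions made for this example.

```javascript
// Computes per-phase durations in milliseconds from an object exposing
// Navigation Timing attributes (e.g., window.performance.timing in a browser).
function phaseDurations(t) {
  return {
    redirect: t.redirectEnd - t.redirectStart,
    appCache: t.domainLookupStart - t.fetchStart,
    dns: t.domainLookupEnd - t.domainLookupStart,
    tcp: t.connectEnd - t.connectStart,
    request: t.responseStart - t.requestStart,
    response: t.responseEnd - t.responseStart,
    processing: t.domComplete - t.domLoading,
    onLoad: t.loadEventEnd - t.loadEventStart,
  };
}
```

In a browser implementing the specification, the function could be called with `window.performance.timing` once the load event has completed, since `loadEventEnd` is not populated before then.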

The Resource Timing specification provides timing data for each embedded object on a page. A variety of timestamps (including start and stop times) are available for the phases of Redirect, App Cache, DNS, Request, and Response. The level of detail in the Resource Timing specification is available only when the server sending the embedded object includes a particular response header with that object. In the absence of that header, the browser makes available only the end-to-end timing for each embedded object. With a content provider's permission, however, a CDN proxy server can insert the header on any CDN-directed object on a hostname controlled by the same content provider as the base page.

The data available through the Navigation Timing/Resource Timing-compliant browsers may not correlate exactly to that obtained through other approaches described above. Therefore, preferably the system 300 displays results from Navigation Timing data separately from non-Navigation Timing data.

3.0 Implementation

This section describes example implementations of the system components.

3.1 Beacon

The inserted code (a snippet of JavaScript in the current example) supports data collection in the case where NavTiming is present in the browser and the case where it is absent. In the former case the JavaScript does nothing more than collect the data and send it back to the service provider. In the latter, the JavaScript can grab a timestamp at the beginning and the end of the page and at the DOMContentLoaded event (where supported), and optionally execute an asynchronous fetch of a small test object from the edge. The back-end 308 will use the timing of the test object fetch to infer the round trip time between the browser and the proxy server, as observed by the browser.

3.1.1 Wrapper

“asbw” (Site Beacon Wrapper) is several lines of code whose function is to download the main body of the JavaScript (sitebeacon.js) in a way that minimizes performance impact on the content provider page. It does this by creating a new script element and adding it to the DOM, which allows the script to be fetched in parallel with other page activity. This code also grabs a timestamp used when NavTiming is not available. This approach is used if it is infeasible to inline the main body of the data collection script.

If used, this body of code can be inlined directly as the first element of the <head> section in the content provider page.
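A wrapper of this kind might look roughly like the sketch below. It is not the actual asbw code; the function name `loadBeaconScript`, its parameters, and the script path in the usage note are hypothetical. The document object is passed in explicitly so that the same logic can be exercised with a stand-in object outside a browser.

```javascript
// Records a fallback timestamp, then adds a script element to the DOM so the
// main beacon script is fetched in parallel with other page activity.
// `doc` is the DOM document and `src` is the (assumed) URL of the main script.
function loadBeaconScript(doc, src, now = Date.now) {
  const headStart = now(); // used later when NavTiming is not available
  const script = doc.createElement("script");
  script.src = src;
  script.async = true; // do not block parsing of the rest of the page
  doc.head.appendChild(script);
  return headStart;
}
```

In a page, the call might read `var headStart = loadBeaconScript(document, "/sitebeacon.js");`, placed in an inline script at the top of the <head> section.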

3.1.2 Site Beacon

Sitebeacon.js supports both the NavTiming and non-NavTiming cases. Support for NavTiming can be inferred by the proxy server from the User Agent header sent by the client 306 with the request. There may be one script for NavTiming and one for non-NavTiming browsers. The code attaches a function to the onload event. When that function executes it determines whether NavTiming is available, and if so collects a subset of the timing data. When NavTiming is not available the script computes DOMContentLoaded and the onload event timestamps, and then optionally fetches a small test object from the same hostname. The back-end will use the elapsed time of the fetch as an estimate of the round trip time (RTT) to the proxy server. In either case, the JavaScript sends the data to the back-end system 308 via query strings on an HTTP ‘Get’ request. It uses a detached <image> tag for this send, so no cross-domain restrictions are imposed by JavaScript.
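The send step might be sketched as follows, assuming a hypothetical beacon endpoint path. `buildBeaconUrl` and `sendViaImage` are illustrative names and are not taken from sitebeacon.js; the example only shows how collected fields could be carried as query string arguments and dispatched through a detached image element.

```javascript
// Encodes each collected field as a query string argument on the beacon URL.
// Fields whose value is undefined or null are omitted.
function buildBeaconUrl(endpoint, fields) {
  const pairs = [];
  for (const [key, value] of Object.entries(fields)) {
    if (value === undefined || value === null) continue;
    pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(String(value)));
  }
  return endpoint + "?" + pairs.join("&");
}

// Issues the HTTP GET by assigning the URL to a detached image element.
// Returns false when no Image constructor exists (i.e., outside a browser).
function sendViaImage(url) {
  if (typeof Image === "undefined") return false;
  const img = new Image(); // never attached to the DOM
  img.src = url; // assigning src triggers the request
  return true;
}
```

Because the image element is never added to the page, the request has no visible effect on rendering.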

3.1.3 Beacon Data

The beacon payload may vary, but for illustrative purposes, an example of a base set of information to be carried is tabulated below. All the NavTiming data is expressed as the delta in milliseconds between the domainLookupStart timestamp and some other NavTiming timestamp. Absolute timestamps can be used, but are not preferred since they are based on the end-user's clock. To use absolute timestamps, though, the raw domainLookupStart time can be sent back, in addition to the other data. The NT column indicates whether or not the datum is derived from the NavTiming data, which clarifies what data would be received from non-NT browsers. All integers are signed in this implementation. The back-end 308 can discard malformed data such as strings in the integer columns or strings of excessive length.

FIELD   TYPE    EXPRESSION                                      NT  MEANING
dnsE    Int     domainLookupEnd - domainLookupStart             Y   DNS lookup time
conS    Int     connectStart - domainLookupStart                Y   Connect start time
sslS    Int     secureConnectionStart - domainLookupStart       Y   SSL start, blank for http
conE    Int     connectEnd - domainLookupStart                  Y   Connect end time
reqS    Int     requestStart - domainLookupStart                Y   Request start time
respS   Int     responseStart - domainLookupStart               Y   Response start time
respE   Int     responseEnd - domainLookupStart                 Y   Response end time
domL    Int     domLoading - domainLookupStart                  Y   DOM Loading time
domI    Int     domInteractive - domainLookupStart              Y   DOM Interactive time
domCLS  Int     domContentLoadedEventStart - domainLookupStart  Y   DOM CL event start
domCLE  Int     domContentLoadedEventEnd - domainLookupStart    Y   DOM CL event complete
domC    Int     domComplete - domainLookupStart                 Y   DOM Complete time
leS     Int     loadEventStart - domainLookupStart              Y   Onload event start time
leE     Int     loadEventEnd - domainLookupStart                Y   Onload event done time
dclE    Int     Measured DOMContentLoaded - head start          N   Time to DOMContentLoaded event as measured by js
plt     Int     Measured onload start - head start              N   Page Load Time as measured by js
testo   Int     Measured testobj end - testobj start            N   Test object fetch time
plat    String  navigator.platform                              N   Platform string from BOM
ua      String  navigator.userAgent                             N   User Agent string from BOM
uri     String  location.href                                   N   Full URI of base page
hs      Int     Constant set by metadata                        N   HTTP Status code for base page
ver     String  Constant set by metadata                        N   Beacon version number
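As a hedged illustration of how the NavTiming-derived fields tabulated above could be computed, the sketch below maps each beacon field name to its Navigation Timing attribute and expresses it as a delta from domainLookupStart. The names `NAV_FIELDS` and `navTimingFields` are assumptions for this example, as is the choice to represent a blank sslS value as an empty string.

```javascript
// Beacon field name -> Navigation Timing attribute, per the tabulated fields.
const NAV_FIELDS = {
  dnsE: "domainLookupEnd",
  conS: "connectStart",
  sslS: "secureConnectionStart",
  conE: "connectEnd",
  reqS: "requestStart",
  respS: "responseStart",
  respE: "responseEnd",
  domL: "domLoading",
  domI: "domInteractive",
  domCLS: "domContentLoadedEventStart",
  domCLE: "domContentLoadedEventEnd",
  domC: "domComplete",
  leS: "loadEventStart",
  leE: "loadEventEnd",
};

// Returns each field as the delta in milliseconds from domainLookupStart.
function navTimingFields(timing) {
  const base = timing.domainLookupStart;
  const out = {};
  for (const [field, attr] of Object.entries(NAV_FIELDS)) {
    const value = timing[attr];
    // secureConnectionStart is zero or absent for plain http: leave it blank.
    if (field === "sslS" && !value) {
      out[field] = "";
      continue;
    }
    out[field] = value - base;
  }
  return out;
}
```

Sending deltas rather than absolute timestamps keeps the reported values independent of the end-user's clock setting, consistent with the preference stated above.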

3.2 Beacon Rate Control

The design described herein gives control over the beacon insertion rate to the content provider or the service provider. The insertion rate is controlled because the beacon has the potential to impact page performance and so the content provider will likely want to restrict it to a fraction of all page loads. Further, the beaconed data has the capability of overwhelming the back-end system 308 with data; restricting the insertion rate is a way to control load on the back-end system 308.

Rates can be set depending on the page being delivered or the type of client device (e.g., OS, browser, device type such as mobile) that requested the page. Beacon insertion rates may also be controlled by whether or not the proxy server is applying some feature (e.g., a web content optimization) to the page.

3.3 Configuration and Insertion of the Code

Edge node-supported code is used to insert the body of asbw.js immediately after the <head> tag in the page markup language.

The extranet portal can have a section that configures the metadata to conditionally insert the JavaScript. The portal can have a configuration page that allows the content provider to control the following:

-   A method for selecting pages into which the beacon should be inserted. It can support controls by filename, path, and extension.
-   For each such URL/path/extension, three distinct rate controls: NavTiming-enabled browsers, other browsers, and mobile devices. For each of these the content provider can specify the percentage of all page views that should receive the JavaScript.
-   When the rate control for non-NavTiming data is nonzero, the content provider can optionally specify a test object to be fetched from JavaScript asynchronously to the page load, for purposes of estimating the round trip time (RTT). If none is specified, none will be fetched and no RTT data will be available. When an object is specified the portal will add metadata that configures this object to return an empty body via a construct-response with status code 200. This avoids data pollution in the case when the test-object fetch takes a cache miss. The portal can suggest a test object name by pre-filling the form field with a file that is unlikely to conflict with any actual object, such as http://<hostname>/CDN_test_object_for_rum.txt.

Configuration might be controlled on many parameters beyond those above: cookies, A/B testing, specific browser types, geographies, and so on.

For illustration (only), FIG. 4 is a wireframe of what a web-based user interface might look like in the portal.

From this (and again only for illustration), the configurator might generate metadata control information that looks like the following pseudo-code:

<match file type is html>
  <match response status is 2xx>
    <set-variable RATE=0>
    <match path matches lowest-numbered-path>
      <match:devicecharacteristic NavTiming result="false">
        <set-variable RATE=X>
      </match>
      <match:devicecharacteristic NavTiming result="true">
        <set-variable RATE=Y>
      </match>
      <match:devicecharacteristic MobileDevice>
        <set-variable RATE=Z>
      </match>
    </match>
    [ Repeat the above block for each configured path, in order ]
    <match random( ) less-than RATE>
      <edgecomputing:akamaizer.tag-filter>
        <rule>#(&lt;head&gt;)# [ body of js to insert ] #</rule>
      </edgecomputing:akamaizer.tag-filter>
    </match>
  </match>
</match>
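For illustration only, the decision logic of the pseudo-code above can be sketched in JavaScript. This is a minimal sketch, not the actual edge metadata: the rule table, the names selectRate and maybeInsertBeacon, the device flags, and the reading that a later matching path block overrides an earlier one are all assumptions made for the example.

```javascript
// Hypothetical rule table, in configured order; paths and rates are illustrative.
const rules = [
  { pathPrefix: "/", rates: { other: 0.01, navTiming: 0.05, mobile: 0.02 } },
  { pathPrefix: "/checkout/", rates: { other: 0.1, navTiming: 0.5, mobile: 0.2 } },
];

// Mirrors the pseudo-code: RATE starts at 0; each matching path block sets it
// from the NavTiming flag (X or Y), and the MobileDevice match (Z) is applied last.
function selectRate(path, device, ruleList) {
  let rate = 0;
  for (const rule of ruleList) {
    if (!path.startsWith(rule.pathPrefix)) continue;
    rate = device.navTiming ? rule.rates.navTiming : rule.rates.other;
    if (device.mobile) rate = rule.rates.mobile;
  }
  return rate;
}

// Inserts the script immediately after the first <head> tag when the random
// draw falls below the selected rate; otherwise returns the page unchanged.
function maybeInsertBeacon(html, rate, scriptBody, random = Math.random) {
  if (!(random() < rate)) return html;
  return html.replace(
    /<head(\s[^>]*)?>/i,
    (tag) => `${tag}<script>${scriptBody}</script>`
  );
}
```

Passing the random source as a parameter is a design choice for the sketch only; it makes the sampling decision repeatable when the function is exercised in isolation.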

In a preferred embodiment, the system maintains a base rate of beacon insertion and then, for certain content-provider-specified categories of traffic, it increases the insertion rate so as to ensure the sample size of data for those categories is sufficiently large to be useful (e.g., high resolution, statistically significant) in performance analyses. To accomplish this, in one embodiment the system activates the beacon insertion function on all pages of a content provider's site, except for a set of pages that the content provider can optionally specify to be excluded. As the CDN serves pages to clients, the system maintains the beacon insertion rate within certain bounds. Since the CDN is serving the pages to end-user clients, the CDN knows the historical page-request rate and the capacity of the back-end system 308 to handle the performance data that will be sent by the beacons. The CDN can use these two pieces of information to set the beacon insertion rate within acceptable bounds, e.g., so that the back-end 308 is not overwhelmed. Through the portal, the content provider can specify particular traffic categories of interest (e.g., certain geographies, browsers, feature sets, etc.), and the CDN can increase the beacon insertion rate for those categories. For example, as a given proxy server 302 in the CDN fields client requests, it will determine whether a given request falls within a particular category (e.g., by mapping client IP address to geography, by examining the user-agent header to determine browser or other client device characteristic, or any such technique known in the art). For requests falling within the category, the proxy server 302 inserts beacons at a particular rate 'N'. For requests falling outside the category, the proxy server 302 inserts beacons at another rate 'M', which would usually be lower than 'N'.
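As a rough sketch of the bounding and category logic described above, the following JavaScript derives a base rate from the historical page-request rate and the back-end capacity, then picks rate 'N' or 'M' per request. The function names, the simple capacity ratio used as the bound, and the representation of a category as a predicate are assumptions for illustration; the text does not prescribe a formula.

```javascript
// Upper bound on the insertion rate so that expected beacon traffic does not
// exceed what the back-end can absorb. Both inputs are in events per second.
function baseInsertionRate(pageRequestsPerSec, backendBeaconsPerSec) {
  if (pageRequestsPerSec <= 0) return 1;
  return Math.min(1, backendBeaconsPerSec / pageRequestsPerSec);
}

// Returns rate N for requests that fall within any category of interest,
// and rate M otherwise. Each category is a predicate over the request.
function insertionRateFor(request, categories, rateM, rateN) {
  const inCategory = categories.some((matches) => matches(request));
  return inCategory ? rateN : rateM;
}
```

For example, with 10,000 page requests per second and a back-end that can take 500 beacons per second, baseInsertionRate yields 0.05, which could serve as 'M' while a larger 'N' is applied to a small category of interest.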

3.4 Delivery of sitebeacon.js

The configuration that delivers sitebeacon.js to the browser is an object delivery service of the CDN service provider.

3.5 Back-End

In the current embodiment, an infrastructure (servers, databases, software, APIs, and so forth) is used for receiving beacon data generated from end-user clients 306. The beacon data is sent from the browser by executing an http GET on a hostname private to the company collecting the data; that hostname points to the back-end 308. The configuration validates any associated authentication token and sends a download receipt to additional back-end infrastructure, where it is consumed by a processing engine that writes the data to a database.
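A minimal sketch of how beacon data might be packed into the URL of such a GET request is shown below. The host name, the /beacon path, the https scheme, and the parameter names (including token) are hypothetical; the text specifies only that the data is sent by an http GET to a hostname that points to the back-end.

```javascript
// Builds the URL for a beacon GET request. In a browser the request could then
// be issued by, for example, assigning the URL to the src of an Image object.
function buildBeaconUrl(collectorHost, token, fields) {
  const url = new URL(`https://${collectorHost}/beacon`);
  url.searchParams.set("token", token);
  for (const [name, value] of Object.entries(fields)) {
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}
```

Using URLSearchParams rather than string concatenation leaves the encoding of spaces and reserved characters in field values to the platform.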

3.6 Visualization

Recall the wireframe diagram of the configuration page shown in FIG. 4. Clicking on one of the paths can take the end-user to the visualization console for that path.

The visual layout of the console might include a set of tabs via which various different graphs can be selected. The interface preferably supports the metrics listed below, in one embodiment. In each case, the portal user can have the option of selecting subsets of these parameters to be displayed on the same graph, and of saving that report definition for future use.

-   -   NavTiming average page-ready timing data: A graph giving average values for page-ready timing metrics that the customer might be interested in. These will be line graphs with time on the X axis. Some parameters of interest are:
        -   End-to-end time: defined as loadEventEnd - domainLookupStart
        -   Onload event time: loadEventStart - domainLookupStart
        -   DOM complete time: domContentLoadedEventStart - domainLookupStart
        -   Page interactive time: domInteractive - domainLookupStart
        -   Download complete time: responseEnd - domainLookupStart
    -   NavTiming 90th percentile page-ready timing data: the same graph as above, calculating the 90th percentile values of the metrics rather than the averages.
    -   Non-NavTiming average page-ready timing data: A graph with time on the X axis and three lines:
        -   Onload event time: elapsed time from the first execution of inlined JavaScript to the onload event
        -   DOMContentLoaded time: elapsed time from the first execution of inlined JavaScript to the DOMContentLoaded event, if any such data exists
        -   Estimated round trip time from client to edge proxy: elapsed time to fetch the test object, if any such object has been configured
    -   Non-NavTiming 90th percentile page-ready timing data: the same graph as above, calculating 90th percentile values of the metrics rather than averages.
    -   Sample counts: a count of the number of samples that went into the plots being displayed, for both NavTiming and non-NavTiming plots, as specified by the current filter.
    -   NavTiming page load phase breakdown: a graphic showing the average values for the components of the HTML transfer, as shown in FIG. 5.
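The NavTiming metric definitions listed above can be written out directly. The sketch below computes them from an object carrying the Navigation Timing attribute names used in the list, and adds mean and 90th-percentile helpers. The function names and the nearest-rank percentile method are assumptions; the text does not say how the percentile is calculated.

```javascript
// Derives the page-ready metrics from Navigation Timing attributes.
// All values are in milliseconds, measured from domainLookupStart.
function pageReadyMetrics(t) {
  const start = t.domainLookupStart;
  return {
    endToEnd: t.loadEventEnd - start,
    onload: t.loadEventStart - start,
    domComplete: t.domContentLoadedEventStart - start,
    pageInteractive: t.domInteractive - start,
    downloadComplete: t.responseEnd - start,
  };
}

// Arithmetic mean of a non-empty array of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Nearest-rank percentile of a non-empty array (p between 0 and 100).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}
```

In a browser, an object with these attributes is available as window.performance.timing where the Navigation Timing interface is supported.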

The infrastructure provides support for reporting and viewing the data by population group rather than globally. All of the visualizations mentioned in the above section may be available broken down along the following axes:

-   -   Country-level geography, and/or state- or metro-area-level.
    -   Connection speed category
    -   Connection type (e.g. cable, DSL, mobile)
    -   Platform (e.g. iPhone, Android, Windows, Linux_x64) as returned in the platform field of the beacon data.
    -   Browser+version (e.g., IE9), or just browser (e.g., IE), or if neither of those is possible, User Agent.

Each of these can be selected independently of the others, making all combinations possible.
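One simple way to support arbitrary combinations of axes is to group samples by a composite key built from whichever axes are selected. The sketch below is illustrative only; the function name, the sample field names, and the key format are assumptions.

```javascript
// Groups beacon samples by any combination of axes (field names).
// Returns a Map from a composite key such as "country=US|browser=IE9"
// to the array of samples that share those values.
function groupSamples(samples, axes) {
  const groups = new Map();
  for (const sample of samples) {
    const key = axes.map((axis) => `${axis}=${sample[axis]}`).join("|");
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(sample);
  }
  return groups;
}
```

Each group can then be fed to the same averaging or percentile calculation used for the global view.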

4.0 Custom Beacon Fields

The utility of the system 300 can be extended by enabling the insertion of certain custom fields into the beacon code for reporting back with the beacon data, for use by the visualization system 310 or the mapping/control system 312 of the CDN. For example, the proxy server 302 can be configured to construct a variable indicating some information of interest, and the value could be inserted into the page with the JavaScript block in the <head> section, and subsequently collected along with the NavTiming or other data and sent to the back-end. The back-end 308 would recognize it as a custom field and automatically produce visualizations computed over the data on each branch of the test. The value might indicate a wide variety of information relating to the content delivery process, such as:

-   -   What web content optimizations the proxy server 302 applied to the page. As previously described (and as noted in U.S. Publication No. 2011/0314091), a wide variety of performance-enhancing modifications can be automatically applied to a page (or other content). Specific examples of such treatments include in-lining content, resource consolidation, minification, image optimization, domain sharding, version control for cacheability, just-in-time loading, adjusting the time at which page scripts are run, device-adaption modifications, and so on. Other treatments might involve compression and/or de-duplication of content. The proxy server can embed identifiers in the beacon to indicate what content optimization treatments it applied to the page, or more generally, what features of the server and CDN (e.g., possibly corresponding to products offered by the CDN) were engaged when processing the page. The resulting data can be interpreted and visualized to indicate the effect of particular optimizations/features and combinations thereof on each page of a content provider.
    -   What version of a page was selected and delivered by the proxy server 302. Thus the variable could indicate whether an A or B side of a test was executed at any given page load.
    -   Information relating to the proxy server 302, the CDN platform, and/or delivery circumstances. For example, the information can include an identifier for the given proxy server, or for the CDN region or point-of-presence of the CDN within which the proxy server resides. Further, the information can indicate where the proxy server 302 fetched the page from (e.g., from local cache, from a particular cache-hierarchy parent or PoP, from NetStorage, or from a content-provider-managed origin server, etc.). Note that this information might also reflect where the proxy server 302 has fetched or will fetch embedded objects on the page, and thus be useful in assessing performance related to the delivery of embedded objects. The information might also identify which intermediate nodes within the CDN are being used for routing content (the page or the embedded objects) to the proxy server for delivery to the client. The resulting data can be interpreted and visualized to indicate whether certain servers or clusters or network functions are presenting performance issues. The data can also be sliced to focus on particular clients or categories of clients (e.g., certain browsers, mobile devices, etc.).
    -   Information relating to the client and/or determined from the client's request. In addition to client IP address and other client identifiers, information derived from client request headers can be included in the beacon. Examples include a determination of geography based on client IP address, and a determination of client device identity or characteristics based on, e.g., the user agent. In the latter case, the proxy server can run an edge device identification module that ingests the user agent given by the client with its initial request for content, and based thereon returns a client device identifier (i.e., which may be an identifier internal to the CDN) and/or characteristics of the device like operating system, physical dimensions, etc. The resulting performance data can be interpreted and visualized to indicate the performance (or lack thereof) for certain clients or categories thereof.

To enable the insertion of custom values, the proxy server's request processing function may expose internally what features it had applied to any given page load. The server could include these variables as internal uses of the custom beacon fields feature.

Visualization based on these fields can be made available to network operations personnel (e.g., in the NOCC) as well as content providers on the portal. The information can be used to validate the effect of features, to make recommendations about configuration of the features that the content provider has signed up for, and/or to make changes to the content itself, to improve performance.
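As a hedged illustration of how a proxy might embed such custom values in the inserted script block, the sketch below serializes a set of fields into a small script that the beacon code could later read and report. The function name, the window.__beaconCustomFields variable, and the field names are hypothetical and do not come from the system described here.

```javascript
// Serializes custom fields (e.g., applied optimizations, A/B branch, fetch
// source) into a script block for insertion into the page's <head> section.
function buildCustomFieldsScript(customFields) {
  // Escape "<" so that a value containing "</script>" cannot terminate
  // the script block early.
  const json = JSON.stringify(customFields).replace(/</g, "\\u003c");
  return `<script>window.__beaconCustomFields = ${json};</script>`;
}
```

The beacon code running in the client could then append each entry of the embedded object to the data it reports, alongside the timing values.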

5.0 Conversion Rate Tracking

In a further embodiment, the system 300 can be extended beyond performance and into conversion rates. The portal could allow commerce content providers to specify that certain URLs or combinations of URLs indicate that a conversion event (e.g., an online sale) occurred, and the visualization system might give feedback on conversion rates as a function of performance or as a function of certain CDN features.
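One way conversion rate could be reported as a function of performance is to bucket sessions by page load time and compute the fraction that converted in each bucket. The sketch below assumes each session record carries a load time and a flag set when a configured conversion URL was reached; the record shape, the function name, and the fixed-width bucketing are assumptions for illustration.

```javascript
// Computes the conversion rate per load-time bucket.
// sessions: array of { loadTimeMs: number, converted: boolean }
// bucketMs: bucket width in milliseconds
function conversionRateByLoadTime(sessions, bucketMs) {
  const buckets = new Map();
  for (const s of sessions) {
    const start = Math.floor(s.loadTimeMs / bucketMs) * bucketMs;
    const b = buckets.get(start) || { sessions: 0, conversions: 0 };
    b.sessions += 1;
    if (s.converted) b.conversions += 1;
    buckets.set(start, b);
  }
  return [...buckets.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([bucketStartMs, b]) => ({
      bucketStartMs,
      sessions: b.sessions,
      rate: b.conversions / b.sessions,
    }));
}
```

The same grouping could be keyed on a custom beacon field instead of load time to compare conversion rates across CDN features.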

6.0 Use of Computer Technologies

The clients, servers, and other computer devices described herein may be implemented with conventional computer systems, as modified by the teachings hereof, with the functional characteristics described above realized in special-purpose hardware, general-purpose hardware configured by software stored therein for special purposes, or a combination thereof.

Software may include one or several discrete programs. A given function may comprise part of any given module, process, execution thread, or other such programming construct. Generalizing, each function described above may be implemented as computer code, namely, as a set of computer instructions, executable in one or more processors to provide a special purpose machine. The code may be executed using conventional apparatus, such as a processor in a computer, digital data processing device, or other computing apparatus, as modified by the teachings hereof. In one embodiment, such software may be implemented in a programming language that runs in conjunction with a proxy on a standard Intel hardware platform running an operating system such as Linux. The functionality may be built into the proxy code, or it may be executed as an adjunct to that code.

While in some cases above a particular order of operations performed by certain embodiments is set forth, it should be understood that such order is exemplary and that the operations may be performed in a different order, combined, or the like. Moreover, some of the functions may be combined or shared in given instructions, program sequences, code portions, and the like. References in the specification to a given embodiment indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic.

FIG. 6 is a block diagram that illustrates hardware in a computer system 600 upon which such software may run in order to implement embodiments of the invention. The computer system 600 may be embodied in a client, server, personal computer, workstation, tablet computer, wireless device, mobile device, network device, router, hub, gateway, or other device.

Computer system 600 includes a processor 604 coupled to bus 601. In some systems, multiple processors and/or processor cores may be employed. Computer system 600 further includes a main memory 610, such as a random access memory (RAM) or other storage device, coupled to the bus 601 for storing information and instructions to be executed by processor 604. A read only memory (ROM) 608 is coupled to the bus 601 for storing information and instructions for processor 604. A non-volatile storage device 606, such as a magnetic disk, solid state memory (e.g., flash memory), or optical disk, is provided and coupled to bus 601 for storing information and instructions. Other application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs) or circuitry may be included in the computer system 600 to perform functions described herein.

Although the computer system 600 is often managed remotely via a communication interface 616, for local administration purposes the system 600 may have a peripheral interface 612 that communicatively couples computer system 600 to a user display 614 that displays the output of software executing on the computer system, and an input device 615 (e.g., a keyboard, mouse, trackpad, touchscreen) that communicates user input and instructions to the computer system 600. The peripheral interface 612 may include interface circuitry and logic for local buses such as Universal Serial Bus (USB) or other communication links.

Computer system 600 is coupled to a communication interface 616 that provides a link between the system bus 601 and an external communication link. The communication interface 616 provides a network link 618. The communication interface 616 may represent an Ethernet or other network interface card (NIC), a wireless interface, modem, an optical interface, or other kind of input/output interface.

Network link 618 provides data communication through one or more networks to other devices. Such devices include other computer systems that are part of a local area network (LAN) 626. Furthermore, the network link 618 provides a link, via an internet service provider (ISP) 620, to the Internet 622. In turn, the Internet 622 may provide a link to other computing systems such as a remote server 630 and/or a remote client 631. Network link 618 and such networks may transmit data using packet-switched, circuit-switched, or other data-transmission approaches.

In operation, the computer system 600 may implement the functionality described herein as a result of the processor executing code. Such code may be read from or stored on a non-transitory computer-readable medium, such as memory 610, ROM 608, or storage device 606. Other forms of non-transitory computer-readable media include disks, tapes, magnetic media, CD-ROMs, optical media, RAM, PROM, EPROM, and EEPROM. Any other non-transitory computer-readable medium may be employed. Executing code may also be read from network link 618 (e.g., following storage in an interface buffer, local memory, or other circuitry).

What is claimed is: 1.-25. (canceled)
 26. A proxy server in a content delivery network (CDN) that has a plurality of proxy servers delivering content to clients, the proxy server comprising: a network interface for receiving a request for a markup language file associated with a web page, the request being received from a client; circuitry forming one or more processors and memory holding instructions to be executed by the one or more processors to cause the proxy server to respond to the request by: obtaining the markup language file from any of an origin server, a CDN network storage subsystem, a cache hierarchy parent, and a local cache; inserting code into the markup language file for execution by the client when processing the markup language file, the code comprising instructions that instruct the client to gather timing information reflecting the client's processing of the markup language file in order to display the web page, and instructions that instruct the client to transmit the timing information in a beacon to the CDN; including in the code an identifier that uniquely identifies at least one of: (i) a point of presence in the CDN in which the proxy server resides; (ii) an element from which the proxy server retrieved the markup language file, the network element comprising any of: the local cache, a cache-hierarchy parent, a CDN network storage subsystem, the origin server; (iii) an element from which the proxy server file any of retrieved and will retrieve an embedded object in the markup language file, the network element comprising any of: a local cache, a cache-hierarchy parent, a CDN network storage subsystem, an origin server; the code including instructions specifying that the client will transmit the identifier, as a field in the beacon, with the timing information; serving the markup language file with the inserted code to the client, in response to the request.
 27. The proxy server of claim 26, wherein the client comprises a web browser.
 28. The proxy server of claim 26, wherein the markup language file comprises an HTML file.
 29. The proxy server of claim 26, wherein the code is JavaScript code that is any of inlined into the markup language file and referenced by a URL inserted into the markup language file.
 30. The proxy server of claim 26, wherein the timing information includes information specified in the Navigation Timing or Resource Timing specification.
 31. The proxy server of claim 26, wherein the timing information includes one or more timestamps for a processing phase at the client.
 32. The proxy server of claim 26, wherein the identifier that uniquely identifies: (i) a point of presence in the CDN in which the proxy server resides.
 33. The proxy server of claim 26, wherein the identifier that uniquely identifies: (ii) a network element from which the proxy server retrieved the markup language file, the network element comprising any of a local cache, a cache-hierarchy parent, a CDN network storage subsystem, the origin server.
 34. The proxy server of claim 26, the identifier that uniquely identifies: (iii) a network element from which the proxy server file any of retrieved and will retrieve an embedded object in the markup language file, the network element comprising any of a local cache, a cache-hierarchy parent, a CDN network storage subsystem, the origin server.
 35. A method performed by a proxy server in a content delivery network (CDN) that comprises a network interface, circuitry forming one or more processors, and memory holding instructions for execution by the one or more processors to perform the method, the method comprising: receiving a request for a markup language file for a web page, the request being received from a client; in response to the request, the proxy server: obtaining the markup language file from any of an origin server, a CDN network storage subsystem, a cache hierarchy parent, and a local cache; inserting code into the markup language file for execution by the client when processing the markup language file, the code comprising instructions that instruct the client to gather timing information reflecting the client's processing of the markup language file in order to display the web page, and instructions that instruct the client to transmit the timing information in a beacon to the CDN; including in the code an identifier that uniquely identifies at least one of: (i) a point of presence in the CDN in which the proxy server resides; (ii) an element from which the proxy server retrieved the markup language file, the network element comprising any of: the local cache, a cache-hierarchy parent, a CDN network storage subsystem, the origin server; (iii) an element from which the proxy server file any of retrieved and will retrieve an embedded object in the markup language file, the network element comprising any of: a local cache, a cache-hierarchy parent, a CDN network storage subsystem, an origin server; the code including instructions specifying that the client will transmit the identifier, as a field in the beacon, with the timing information; serving the markup language file with the inserted code to the client, in response to the request.
 36. The method of claim 35, wherein the client comprises a web browser.
 37. The method of claim 35, wherein the markup language file comprises an HTML file.
 38. The method of claim 35, wherein the code is JavaScript code that is any of inlined into the markup language file and referenced by a URL inserted into the markup language file.
 39. The method of claim 35, wherein the timing information includes information specified in the Navigation Timing or Resource Timing specification.
 40. The method of claim 35, wherein the timing information includes one or more timestamps for a processing phase at the client.
 41. The method of claim 35, wherein the identifier that uniquely identifies: (i) a point of presence in the CDN in which the proxy server resides.
 42. The method of claim 35, wherein the identifier that uniquely identifies: (ii) a network element from which the proxy server retrieved the markup language file, the network element comprising any of a local cache, a cache-hierarchy parent, a CDN network storage subsystem, the origin server.
 43. The method of claim 35, the identifier that uniquely identifies: (iii) a network element from which the proxy server file any of retrieved and will retrieve an embedded object in the markup language file, the network element comprising any of a local cache, a cache-hierarchy parent, a CDN network storage subsystem, the origin server.